Coreference Resolution for the Basque Language with BART

نویسندگان

  • Ander Soraluze
  • Olatz Arregi Uriarte
  • Xabier Arregi
  • Arantza Díaz de Ilarraza
  • Mijail A. Kabadjov
  • Massimo Poesio
چکیده

In this paper we present our work on Coreference Resolution in Basque, a unique language which poses interesting challenges for the problem of coreference. We explain how we extend the coreference resolution toolkit, BART, in order to enable it to process Basque. Then we run four different experiments showing both a significant improvement by extending a baseline feature set and the effect of calculating performance of hand-parsed mentions vs. automatically parsed mentions. Finally, we discuss some key characteristics of Basque which make it particularly challenging for coreference and draw a road map for future work.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Corpus based coreference resolution for Farsi text

"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

متن کامل

Corefrence resolution with deep learning in the Persian Labnguage

Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...

متن کامل

Coreference Resolution for Morphologically Rich Languages. Adaptation of the Stanford System to Basque

This paper presents the adaptation of the Stanford coreference resolution system to Basque, an agglutinative head-final pro-drop language. The adapted system has been integrated into a global linguistic analysis pipeline so that the input of the system are original Basque raw texts linguistically processed, and annotated. We demonstrate that language-specific characteristics have a noteworthy e...

متن کامل

Creating a Coreference Resolution System for Italian

This paper summarizes our work on creating a full-scale coreference resolution (CR) system for Italian, using BART – an open-source modular CR toolkit initially developed for English corpora. We discuss our experiments on language-specific issues of the task. As our evaluation experiments show, a language-agnostic system (designed primarily for English) can achieve a performance level in high f...

متن کامل

Extending BART to Provide a Coreference Resolution System for German

We present a flexible toolkit-based approach to automatic coreference resolution on German text. We start with our previous work aimed at reimplementing the system from Soon et al. (2001) for English, and extend it to duplicate a version of the state-of-the-art proposal from Klenner and Ailloud (2009). Evaluation performed on a benchmarking dataset, namely the TüBa-D/Z corpus (Hinrichs et al., ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016